Techniques for annotating portions of a document relevant to concepts of interest

ABSTRACT

An automatic reading assistance application for documents available in electronic form. An automatic annotator is provided which finds concepts of interest and keywords. The operation of the annotator is personalizable for a particular user. The annotator is also capable of improving its performance over time by both automatic and manual feedback. The annotator is usable with any electronic document. Another available feature is a thumbnail image of all or part of a multi-page document wherein a currently displayed section of the document is highlighted in the thumbnail image. Movement of the highlighted area in the thumbnail image is then synchronized with scrolling through the document.

BACKGROUND OF THE INVENTION

[0001] The present invention relates to display of electronic documents and more particularly to method and apparatus for augmenting electronic document display with features to enhance the experience of reading an electronic document on a display.

[0002] Increasingly, readers of documents are being called upon to assimilate vast quantities of information in a short period of time. To meet the demands placed upon them, readers find they must read documents “horizontally,” rather than “vertically,” i.e., they must scan, skim, and browse sections of interest in multiple documents rather than read and analyze a single document from beginning to end.

[0003] Documents are now more and more available in electronic form. Some documents are available electronically by virtue of their having been locally created using word processing software. Other electronic documents are accessible via the Internet. Yet others may become available in electronic form by virtue of being scanned in, copied, or faxed. See commonly assigned U.S. application Ser. No. 08/754,721, entitled AUTOMATIC AND TRANSPARENT DOCUMENT ARCHIVING, the contents of which are herein incorporated by reference.

[0004] However, the mere availability of documents in electronic form does not assist the reader in confronting the challenges of assimilating information quickly. Indeed, many time-challenged readers still prefer paper documents because of their portability and the ease of flipping through pages.

[0005] Certain tools exist to take advantage of the electronic form of documents to assist harried readers. Tools exist to search for documents both on the Internet and locally. However, once the document is identified and retrieved, further search capabilities are limited to keyword searching. Automatic summarization techniques have also been developed but have limitations in that they are not personalized. They summarize based on general features found in sentences.

[0006] What is needed is a document display system that helps the reader find as well as assimilate the information he or she wants more quickly. The document display system should be easily personalizable and flexible as well.

SUMMARY OF THE INVENTION

[0007] An automatic reading assistance application for documents in electronic form is provided by virtue of the present invention. In certain embodiments, an automatic annotator is provided which finds concepts of interest and keywords. The operation of the annotator is personalizable for a particular user. The annotator is also capable of improving its performance over time by both automatic and manual feedback. The annotator is usable with any electronic document. Another available feature is an elongated thumbnail image of all or part of a multi-page document wherein a currently displayed section of the document is emphasized in the elongated thumbnail image. Movement of the emphasized area in the elongated thumbnail image is then synchronized with scrolling through the document.

[0008] In accordance with a first aspect of the present invention, a method for annotating an electronically stored document includes steps of: accepting user input indicating user-specific concepts of interest, analyzing the electronic document to identify locations of discussion of the user-specific concepts of interest, and displaying the electronic document with visual indications of the identified locations.

[0009] In accordance with a second aspect of the present invention, a method for displaying a multi-page document includes steps of: displaying an elongated thumbnail image of a multi-page document in a first viewing area of a display, displaying a section of the multi-page document in a second viewing area of the display in legible form, emphasizing an area of the elongated thumbnail image corresponding to the section displayed in the second viewing area, accepting user input controlling sliding of the emphasized area through the thumbnail image, and scrolling the displayed section through the second viewing area responsive to the sliding so that the emphasized area continues to correspond to the displayed section.

[0010] A further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 depicts a representative computer system suitable for implementing the present invention.

[0012] FIGS. 2A-2D depict document browsing displays in accordance with one embodiment of the present invention.

[0013] FIG. 3 depicts a document summary view in accordance with one embodiment of the present invention.

[0014] FIG. 4 depicts a table of contents view in accordance with one embodiment of the present invention.

[0015] FIG. 5 depicts a top-level software architectural diagram for automatic annotation in accordance with one embodiment of the present invention.

[0016] FIGS. 6A-6C depict a detailed software architectural diagram for automatic annotation in accordance with one embodiment of the present invention.

[0017] FIG. 7 depicts a representative Bayesian belief network useful in automatic annotation in accordance with one embodiment of the present invention.

[0018] FIG. 8 depicts a user interface for defining a user profile in accordance with one embodiment of the present invention.

[0019] FIGS. 9A-9B depict an interface for providing user feedback in accordance with one embodiment of the present invention.

[0020] FIG. 10 depicts a portion of an HTML document processed in accordance with one embodiment of the present invention.

DESCRIPTION OF SPECIFIC EMBODIMENTS

[0021] Computer System Usable for Implementing the Present Invention

[0022] FIG. 1 depicts a representative computer system suitable for implementing the present invention. FIG. 1 shows basic subsystems of a computer system 10 suitable for use with the present invention. In FIG. 1, computer system 10 includes a bus 12 which interconnects major subsystems such as a central processor 14, a system memory 16, an input/output controller 18, an external device such as a printer 20 via a parallel port 22, a display screen 24 via a display adapter 26, a serial port 28, a keyboard 30, a fixed disk drive 32 and a floppy disk drive 33 operative to receive a floppy disk 33A. Many other devices may be connected such as a scanner 34 via I/O controller 18, a mouse 36 connected to serial port 28 or a network interface 40. Many other devices or subsystems (not shown) may be connected in a similar manner. Also, it is not necessary for all of the devices shown in FIG. 1 to be present to practice the present invention, as discussed below. The devices and subsystems may be interconnected in different ways from that shown in FIG. 1. The operation of a computer system such as that shown in FIG. 1 is readily known in the art and is not discussed in detail in the present application. Source code to implement the present invention may be operably disposed in system memory 16 or stored on storage media such as a fixed disk 32 or a floppy disk 33A. Image information may be stored on fixed disk 32.

[0023] Annotated Document User Interface

[0024] The present invention provides a personalizable system for automatically annotating documents to locate concepts of interest to a particular user. FIG. 2A depicts one user interface 200 for viewing a document that has been annotated in accordance with the present invention. A first viewing area 202 shows a section of an electronic document. Using a scroll bar 204, or in other ways, the user may scroll the displayed section through the electronic document.

[0025] A series of concept check boxes 206 permit the user to select which concepts of interest are to be noted in the document. A sensitivity control 208 permits the user to select the degree of sensitivity to apply in identifying potential locations of relevant discussion. At low sensitivity, more locations will be denoted as being relevant, even though some may not be of any actual interest. At high sensitivity, nearly all denoted locations will in fact be relevant but some other relevant locations may be missed. After each concept name appearing by one of check boxes 206 appears a percentage giving the relevance of the currently viewed document to the concept. These relevance levels offer a quick assessment of the relevance of the document to the selected concepts. FIG. 2A shows no annotations because a plain text view rather than an annotated view has been selected for first viewing area 202.
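The trade-off described for sensitivity control 208 may be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate location carries a relevance score between 0 and 1 and that the sensitivity setting acts as a cut-off on that score; the names `Location` and `visible_annotations` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class Location:
    text: str
    relevance: float  # assumed score in [0, 1]; not specified in the source


def visible_annotations(locations, sensitivity):
    """Keep only locations whose score meets the sensitivity cut-off.

    A low cut-off admits more (possibly irrelevant) locations; a high
    cut-off admits fewer, at the risk of missing relevant ones.
    """
    return [loc for loc in locations if loc.relevance >= sensitivity]


locations = [
    Location("neural network", 0.9),
    Location("network", 0.4),
    Location("graph", 0.2),
]
low = visible_annotations(locations, 0.3)   # more locations denoted
high = visible_annotations(locations, 0.8)  # fewer, more reliable locations
```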

[0026] A thumbnail view 214 of the entire document is found in a second viewing area 215. Thumbnail view 214 is discussed in greater detail below.

[0027] Miscellaneous navigation tools are found on a navigation toolbar 216. Miscellaneous annotation tools are found on an annotation toolbar 218. The annotation tools on annotation toolbar 218 facilitate navigation through a collection of documents.

[0028] According to the present invention, annotations may be added to the text displayed in first viewing area 202. The annotations denote text relevant to user-selected concepts. As will be explained further below, an automatic annotation system according to the present invention adds these annotations to any document available in electronic form. The document need not include any special information to assist in locating discussion of concepts of interest.

[0029] FIG. 2B depicts the document view of FIG. 2A but with annotation added in first viewing area 202. Phrases 220 have been highlighted to indicate that they relate to concepts of interest to the user. The highlighting is preferably in color. However, for ease of illustration in black-and-white format, rectangles indicate the highlighted areas of text. For further emphasis, the highlighted text is preferably printed in bold. A rectangular bar 222 indicates a paragraph that has been determined to have relevance above a predetermined threshold or to have more than a threshold number of key phrases. Rectangular bar 222 is merely representative of various forms of marginal annotation that might be used to indicate a relevant section of the text.

[0030] FIG. 2C depicts an alternative style of annotation. Now in first viewing area 202, entire sentences 224 including phrases relevant to concepts of interest are highlighted. The phrases themselves are printed in bold text. It has been found that highlighting the entire sentence rather than just a relevant phrase provides the user with far more information at a glance.

[0031] FIG. 2D depicts how further information about key phrases may be displayed. The user may select any highlighted key phrase with the mouse. Upon selection of the key phrase, a balloon 226 appears. The balloon includes further information relevant to the key phrase. For example, the balloon may include the name of the concept to which the keyword is relevant. The balloon may also include bibliographic information if the key phrase includes a citation.

[0032] FIG. 3 depicts a document summary view in accordance with one embodiment of the present invention. The user may optionally select a summary view 300 of the document. Summary view 300 lists the concepts of interest 302 that are found in the document as headings of an outline. For each concept, keywords or key phrases 304 are listed which are indicative of the concept of interest. A number in parentheses by each keyword indicates the number of times the keyword or key phrase appears. Each concept also has an associated score 306 indicative of the relevance of the whole document to the concept.

[0033] FIG. 4 depicts a table of contents view in accordance with one embodiment of the present invention. An alternative to summary view 300 is a table of contents view 400. Table of contents view 400 lists major headings 402 and subheadings 403 of the electronic document. By selecting one of hierarchical display icons 404, the user may list the concepts 406 found under one of the document headings 402 or subheadings 403 with an indication of relevance for each concept and the number of keywords found. There is also a relevance meter 408 for each document heading 402 that indicates the overall relevance of the text under that heading for all of the currently selected concepts. In a preferred embodiment where the document is an HTML document, to create table-of-contents view 400, the headings of the document are identified by an analysis of the HTML heading tags.
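The identification of headings by analysis of HTML heading tags may be sketched as follows. This is an illustrative sketch only, using the `html.parser` module of the Python standard library; the class name `HeadingCollector` and the function `extract_headings` are hypothetical, and the specification does not state how its heading analysis is actually implemented.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for <h1> through <h6> elements."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def extract_headings(html):
    """Return the document's headings in order of appearance."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings
```

Level 1 entries would correspond to major headings 402 and deeper levels to subheadings 403; the mapping of levels to the view's hierarchy is an assumption.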

[0034] Automatic Annotation Software

[0035] FIG. 5 depicts a top-level software architectural diagram for automatic annotation in accordance with one embodiment of the present invention. A document 502 exists in electronic form. It may have been scanned in originally. It may be, e.g., in HTML, Postscript, LaTeX, other word processing or e-mail formats, etc. The description that follows assumes an HTML format. A user 504 accesses document 502 through a document browser 506 and an annotation agent 508. Document browser 506 is preferably a hypertext browsing program such as Netscape Navigator or Microsoft Explorer but also may be, e.g., a conventional word processing program.

[0036] Annotation agent 508 adds the annotations to document 502 to prepare it for viewing by document browser 506. Processing by annotation agent 508 may be understood to be in three stages, a text processing stage 510, a content recognition stage 512, and a formatting stage 514. The input to text processing stage 510 is raw text. The output from text processing stage 510 and input to content recognition stage 512 is a parsed text stream, a text stream with formatting information such as special tags around particular words or phrases removed. The output from content recognition stage 512 and input to formatting stage 514 is an annotated text stream. The output of formatting stage 514 is a formatted text file viewable with document browser 506.
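The three-stage flow of annotation agent 508 may be sketched in simplified form. The sketch below is an assumption-laden illustration, not the described implementation: tags are stripped with a regular expression, content recognition is reduced to case-insensitive keyword matching, and the formatter emits plain `<b>` elements rather than the special annotation tags of the invention. All function names are hypothetical.

```python
import re


def text_processing(raw):
    """Stage 510 (simplified): remove markup to yield a parsed text stream."""
    return re.sub(r"<[^>]+>", "", raw)


def content_recognition(parsed, keywords):
    """Stage 512 (simplified): return sorted (start, end, keyword) spans."""
    spans = []
    for kw in keywords:
        for m in re.finditer(re.escape(kw), parsed, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), kw))
    return sorted(spans)


def formatting(parsed, spans):
    """Stage 514 (simplified): wrap each recognized span in <b> tags."""
    out, pos = [], 0
    for start, end, _ in spans:
        if start < pos:
            continue  # skip a span that overlaps one already emitted
        out.append(parsed[pos:start])
        out.append("<b>" + parsed[start:end] + "</b>")
        pos = end
    out.append(parsed[pos:])
    return "".join(out)


def annotate(raw, keywords):
    """Run the three stages in order on one document."""
    parsed = text_processing(raw)
    return formatting(parsed, content_recognition(parsed, keywords))
```

Keeping the stages as separate functions mirrors the stated interfaces: raw text in, parsed text stream, annotated stream, formatted output.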

[0037] The processing of annotation agent 508 is preferably a run-time process. The annotations are preferably not pre-inserted into the text but are rather generated when user 504 requests document 502 for browsing. Thus, this is preferably a dynamic process. Annotation agent 508 may also, however, operate in the background as a batch process.

[0038] The annotation added by annotation agent 508 depends on concepts of interest selected by user 504. User 504 also inputs information used by annotation agent 508 to identify locations of discussion of concepts of interest in document 502. In a preferred embodiment, this information defines the structure of a Bayesian belief network. The concepts of interest and other user-specific information are maintained in a user profile file 516. User 504 employs a profile editor 518 to modify the contents of user profile file 516.

[0039] FIG. 6A depicts the automatic annotation software architecture of FIG. 5 with text processing stage 510 shown in greater detail. FIG. 6A shows that the source of document 502 may be accessed via a network 602. Possible sources include, e.g., the Internet 604, an intranet 606, a digital copier 608 that captures document images, or other office equipment 610 such as a fax machine, scanner, printer, etc. Another alternative source is the user's own hard drive 32.

[0040] Text processing stage 510 includes a file I/O stage 612, an updating stage 614, and a language processing stage 616. File I/O stage 612 reads the document file from network 602. Updating stage 614 maintains a history of recently visited documents in a history file 618. Language processing stage 616 parses the text of document 502 to generate the parsed text output of text processing stage 510.

[0041] FIG. 6B depicts the automatic annotation software architecture of FIG. 5 with content recognition stage 512 shown in greater detail. A pattern identification stage 620 looks for particular patterns in the parsed text output of text processing stage 510. The particular patterns searched for are determined by the contents of user profile file 516. Once the patterns are found, annotation tags are added to the parsed text by an annotation tag addition stage 622 to indicate the pattern locations. In a preferred HTML embodiment, these annotation tags are compatible with the HTML format. However, the tagging process may be adapted to LaTeX, Postscript, etc. A profile updating stage 624 monitors the output of annotation tag addition stage 622 and analyzes text surrounding the locations of concepts of interest. As will be further discussed with reference to FIG. 7, profile updating stage 624 changes the contents of user profile file 516 based on the analysis of this surrounding text. The effect is to automatically refine the patterns searched for by pattern identification stage 620 to improve annotation performance.

[0042] FIG. 6C depicts the automatic annotation software architecture of FIG. 5 with formatting stage 514 shown in greater detail. Formatting stage 514 includes a text rendering stage 626 that formats the annotated text provided by content recognition stage 512 to facilitate viewing by document browser 506. An HTML document as modified by formatting stage 514 is discussed in greater detail with reference to FIG. 10.

[0043] Pattern identification stage 620 looks for keywords and key phrases of interest and locates relevant discussion of concepts based on the located keywords. The identification of keywords and the application of the keywords to locating relevant discussion is preferably accomplished by reference to a belief system. The belief system is preferably a Bayesian belief network.

[0044] FIG. 7 depicts a portion of a representative Bayesian belief network 700 implementing a belief system as used by pattern identification stage 620. A first oval 702 represents a particular user-specified concept of interest. Other ovals 704 represent subconcepts related to the concept identified by oval 702. Each line between one of subconcept ovals 704 and concept oval 702 indicates that discussion of the subconcept implies discussion of the concept. Each connection between one of subconcept ovals 704 and concept oval 702 has an associated probability value indicated in percent. These values in turn indicate the probability that the concept is discussed given the presence of evidence indicating the presence of the subconcept. Discussion of the subconcept is in turn indicated by one or more keywords or key phrases (not shown in FIG. 7).

[0045] The structure of Bayesian belief network 700 is only one possible structure applicable to the present invention. For example, one could employ a Bayesian belief network with more than two levels of hierarchy so that the presence of subconcepts is suggested by the presence of “subsubconcepts” and so on. In the preferred embodiment, presence of a keyword or key phrase always indicates presence of discussion of the subconcept but it is also possible to configure the belief network so that presence of a keyword or key phrase suggests discussion of the subconcept with a specified probability.
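A two-level network of the kind described may be evaluated as in the following sketch. The specification does not state how evidence from several subconcepts is combined; the sketch assumes a noisy-OR combination, P(concept) = 1 - Π(1 - p_i) over the subconcepts found present, which is one common choice and is an assumption here. The dictionary layout and the function name `concept_probability` are hypothetical.

```python
def concept_probability(text, subconcepts):
    """Estimate the probability that a concept is discussed in `text`.

    Each subconcept is a dict with:
      "keywords":    phrases whose presence indicates the subconcept
                     (deterministically, as in the preferred embodiment)
      "probability": P(concept discussed | subconcept present)
    Evidence is combined with noisy-OR, which is an assumption of this
    sketch and is not taken from the specification.
    """
    lowered = text.lower()
    p_absent = 1.0
    for sub in subconcepts:
        if any(kw.lower() in lowered for kw in sub["keywords"]):
            p_absent *= 1.0 - sub["probability"]
    return 1.0 - p_absent


subconcepts = [
    {"keywords": ["belief network"], "probability": 0.8},
    {"keywords": ["prior", "posterior"], "probability": 0.5},
]
score = concept_probability("A belief network updates the posterior.", subconcepts)
```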

[0046] The primary source for the structure of Bayesian belief network 700 including the selection of concepts, keywords and key phrases, interconnections, and probabilities is user profile file 516. In a preferred embodiment, user profile file 516 is selectable for both editing and use from among profiles for many users.

[0047] The structure of belief system 700 is however also modifiable during use of the annotation system. The modifications may occur automatically in the background or may involve explicit user feedback input. The locations of concepts of interest determined by pattern identification stage 620 are monitored by profile updating stage 624. Profile updating stage 624 notes the proximity of other keywords and key phrases within each analyzed document to the locations of concepts of interest. If particular keywords and key phrases are always near a concept of interest, the structure and contents of belief system 700 are updated in the background without user input by profile updating stage 624. This could mean changing probability values, introducing a new connection between a subconcept and concept, or introducing a new keyword or key phrase.
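The background learning step may be illustrated by a proximity count. The sketch assumes whitespace tokenization, single-word keywords, and a fixed token window, none of which is specified; `candidate_keywords`, `window`, and `min_share` are hypothetical names. With `min_share=1.0` a word is proposed only if it appears near every occurrence of the known keyword, corresponding to the "always near" condition.

```python
from collections import Counter


def candidate_keywords(documents, known_keyword, window=3, min_share=1.0):
    """Propose words found near occurrences of `known_keyword`.

    `documents` is a list of strings and `known_keyword` a single
    lower-case word. A word qualifies when the share of keyword
    occurrences having it within `window` tokens is at least `min_share`.
    """
    near = Counter()
    hits = 0
    for doc in documents:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok != known_keyword:
                continue
            hits += 1
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            near.update(set(before + after))  # count each word once per hit
    if hits == 0:
        return []
    return sorted(
        w for w, n in near.items()
        if w != known_keyword and n / hits >= min_share
    )
```

Words returned by such a routine could then be added to the belief network as new keywords; how the described system weights them is not stated.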

[0048] User 504 may select a word or phrase in document 502 as being relevant to a particular concept even though the word or phrase has not yet been defined to be a keyword or key phrase. Belief system 700 is then updated to include the new keyword or key phrase.

[0049] User 504 may also give feedback for an existing keyword or key phrase, indicating the perceived relevance of the keyword or key phrase to the concept of interest. If the selected keyword or key phrase is indicated to be of high relevance to the concept of interest, the probability values connecting the subconcept indicated by the selected keywords or key phrases to the concept of interest increase. If, on the other hand, user 504 indicates the selected keywords or key phrases to be of little interest, the probability values connecting these keywords or key phrases to the concept decrease.

[0050] User Profile and Feedback Interfaces

[0051] FIG. 8 depicts a user interface for defining a user profile in accordance with one embodiment of the present invention. User interface screen 800 is provided by profile editor 518. A profile name box 802 permits the user to enter the name of the person or group to whom the profile to be edited is assigned. This permits the annotation system according to the present invention to be personalized to particular users or groups. A password box 804 provides security by requiring entry of a correct password prior to profile editing operations.

[0052] A defined concepts list 806 lists all of the concepts which have already been added to the user profile. By selecting a concept add button 808, the user may add a new concept. By selecting a concept edit button 810, the user may modify the belief network as it pertains to the listed concept that is currently selected. By selecting a remove button 812, the user may delete a concept.

[0053] If a concept has been selected for editing, its name appears in a concept name box 813. The portion of the belief network pertaining to the selected concept is shown in a belief network display window 814. Belief network display window 814 shows the selected concept, the subconcepts which have been defined as relating to the selected concept and the percentage values associated with each relationship. The user may add a subconcept by selecting a subconcept add button 815. The user may edit a subconcept by selecting the subconcept in belief network display window 814 and then selecting a subconcept edit button 816. A subconcept remove button 818 permits the user to delete a subconcept from the belief network.

[0054] Selecting subconcept add button 815 causes a subconcept add window 820 to appear. Subconcept add window 820 includes a subconcept name box 822 for entering the name of a new subconcept. A slider control 824 permits the user to select the percentage value that defines the probability of the selected concept appearing given that the newly selected subconcept appears. A keyword list 826 lists the keywords and key phrases which indicate discussion of the subconcept. The user adds to the list by selecting a keyword add button 828 which causes display of a dialog box (not shown) for entering the new keyword or key phrase. The user deletes a keyword or key phrase by selecting it and then selecting a keyword delete button 830. Once the user has finished defining the new subconcept, he or she confirms the definition by selecting an OK button 832. Selection of a cancel button 834 dismisses subconcept add window 820 without affecting the belief network contents or structure. Selection of subconcept edit button 816 causes display of a window similar to subconcept add window 820 permitting redefinition of the selected subconcept.

[0055] By selecting or deselecting a background learning checkbox 836, the user may enable or disable the operation of profile updating stage 624. A web autofetch check box 838 permits the user to select whether or not to enable an automatic web search process. When this web search process is enabled, whenever a particular keyword or key phrase is found frequently near where a defined concept is determined to be discussed, a web search tool such as AltaVista™ is employed to look on the World Wide Web for documents containing the keyword or key phrase. A threshold slider control 840 is provided to enable the user to set a threshold relevance level for this autofetching process.

[0056] FIGS. 9A-9B depict a user interface for providing feedback in accordance with one embodiment of the present invention. User 504 may select any text and call up a first feedback window 902. The text may or may not have been previously identified by the annotation system as relevant. In first feedback window 902 shown in FIG. 9A, user 504 may indicate the concept to which the selected text is relevant. First feedback window 902 may not be necessary when adjusting the relevance level for a keyword or key phrase that is already a part of belief network 700. After the user selects a concept in first feedback window 902, a second feedback window 904 is displayed for selecting the degree of relevance. Second feedback window 904 in FIG. 9B provides three choices for level of relevance: good, medium (not sure), and bad. Alternatively, a slider control could be used to set the level of relevance. If the selected text is not already a keyword or key phrase in belief network 700, a new subconcept is added along with the associated new keyword or key phrase. If the selected text is already a keyword or key phrase, as described above, probability values within belief network 700 are modified appropriately in response to this user feedback.
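The effect of the three feedback choices on a probability value may be sketched as below. The size of the adjustment is not given in the specification; the fixed step of 0.1, the treatment of "medium" as no change, and the function name `apply_feedback` are all assumptions made for illustration.

```python
def apply_feedback(probability, rating, step=0.1):
    """Return an updated subconcept-to-concept probability.

    `rating` is one of "good", "medium", or "bad", matching the three
    choices of the second feedback window. The step size is an assumed
    value; the result is clamped to the range [0, 1].
    """
    delta = {"good": step, "medium": 0.0, "bad": -step}[rating]
    return min(1.0, max(0.0, probability + delta))
```

A slider-based variant could pass a signed amount in place of the three discrete ratings.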

[0057] FIG. 10 depicts a portion of an HTML document 1000 processed in accordance with one embodiment of the present invention. A sentence including relevant text is preceded by an <RH.ANOH.S . . . > tag 1002 and followed by an </RH.ANOH.S> tag 1004. The use of these tags facilitates the annotation mode where complete sentences are highlighted. The <RH.ANOH.S . . . > tag 1002 includes a number indicating which relevant sentence is tagged in order of appearance in the document. Relevant text within a so-tagged relevant sentence is preceded by an <RH.ANOH . . . > tag 1006 and followed by an </RH.ANOH> tag 1008. The <RH.ANOH . . . > tag 1006 includes the names of the concept and subconcept to which the annotated text is relevant, an identifier indicating which relevant sentence the text is in and a number which identifies which annotation this is in sequence for a particular concept. An HTML browser that has not been modified to interpret the special annotation tags provided by the present invention will ignore them and display the document without annotations.

[0058] Thumbnail Image Display

[0059] Referring again to FIGS. 2A-2D, an elongated thumbnail image 214 of many pages, or all, of document 502 is presented in second viewing area 215. Document 502 will typically be a multi-page document with a section being displayed in first viewing area 202. Elongated thumbnail image 214 provides a convenient view of the basic document structure. The annotations incorporated into the document are visible within elongated thumbnail image 214. Within elongated thumbnail image 214, an emphasized area 214A shows a reduced view of the document section currently displayed in first viewing area 202 with the reduction ratio preferably being user-configurable. Thus, if the first viewing area 202 changes in size because of a change of window size, emphasized area 214A will also change in size accordingly. The greater the viewing area allocated to elongated thumbnail image 214 and emphasized area 214A, the more detail is visible. With very small allocated viewing areas, only sections of the document may be distinguishable. As the allocated area increases, individual lines and eventually individual words become distinguishable. In FIGS. 2A-2D the user-configured ratio is approximately 5:1. Emphasized area 214A may be understood to be a lens or a viewing window over the part of elongated thumbnail image 214 corresponding to the document section displayed in first viewing area 202. User 504 may scroll through document 502 by sliding emphasized area 214A up and down. As emphasized area 214A shifts, the section of document 502 displayed in first viewing area 202 will also shift. User 504 may also scroll conventionally using scroll bar 204 or arrow keys and emphasized area 214A will slide up or down as appropriate in response.
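The synchronization between the emphasized area and the displayed section reduces to a proportional mapping in both directions. The sketch below assumes pixel coordinates measured from the top of the document and of the thumbnail, a uniform reduction ratio (5:1 by default, as in the figures), and hypothetical function names; it is not drawn from the described implementation.

```python
def emphasized_area(scroll_top, viewport_height, ratio=5.0):
    """Map the displayed section to (top, height) within the thumbnail."""
    return scroll_top / ratio, viewport_height / ratio


def scroll_for_drag(thumb_top, document_height, viewport_height, ratio=5.0):
    """Map a dragged emphasized-area position back to a scroll offset.

    The result is clamped so the displayed section never runs past
    either end of the document.
    """
    max_scroll = max(0.0, document_height - viewport_height)
    return min(max_scroll, max(0.0, thumb_top * ratio))
```

Because the emphasized area's height is derived from the viewport height, resizing the window changes the size of the emphasized area accordingly.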

[0060] In FIGS. 2A-2C elongated thumbnail image 214 displays each page of document 502 at the same reduced scale. The present invention also contemplates other modes of scaling elongated thumbnail image 214. For example, one may display emphasized area 214A at a scale similar to that shown in FIGS. 2A-2C and use a variable scale for the rest of elongated thumbnail image 214. Text far away from emphasized area 214A would be displayed at a highly reduced scale and the degree of magnification would increase with nearness to emphasized area 214A.

[0061] Because the annotations appear in elongated thumbnail image 214, it is very easy to find relevant text anywhere in document 502. Furthermore, elongated thumbnail image 214 provides a highly useful way of keeping track of one's position within a lengthy document.

[0062] Software Implementation

[0063] In a preferred embodiment, software to implement the present invention is written in the Java language. Preferably, the software forms a part of a stand-alone browser program written in the Java language. Alternatively, the code may be in the form of a so-called “plug-in” operating with a Java-equipped web browser used to browse HTML documents including the special annotation tags explained above.

[0064] In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. For example, any probabilistic inference method may be substituted for a Bayesian belief network. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the appended claims and their full scope of equivalents.

What is claimed is:
 1. A computer-implemented method for annotating an electronically stored document comprising the steps of: accepting user input indicating a user-specified concept of interest; analyzing said electronic document to identify locations of discussion of said user-specified concept of interest; and displaying said electronic document with visual indications of said identified locations.
 2. The method of claim 1 wherein said analyzing step comprises exploiting a probabilistic inference method to identify said locations.
 3. The method of claim 2 wherein said probabilistic inference method comprises a Bayesian belief network.
 4. The method of claim 2 further comprising the step of: accepting user input defining a structure of said Bayesian belief network.
 5. The method of claim 4 further comprising the step of: modifying said Bayesian belief network in accordance with content of previously visited electronic documents.
 6. The method of claim 1 wherein said displaying step comprises the substep of: highlighting sections of said document surrounding said locations.
 7. The method of claim 1 wherein said displaying step comprises the substep of: displaying a balloon pointing to a user-selected one of said locations, said balloon identifying said user-specified concept to which text in said user-selected one of said locations is relevant.
 8. The method of claim 1 wherein said displaying step comprises the substep of: displaying marginal notation identifying said locations.
 9. The method of claim 3 further comprising the steps of: accepting user input indicating a degree of relation between said locations and said concept of interest; and modifying said Bayesian belief network responsive to said degree of relation.
 10. The method of claim 1 further comprising the step of displaying a level of relevance of said document to said concept of interest.
 11. A computer-implemented method for displaying a multipage document comprising the steps of: displaying an elongated thumbnail image of a multi-page document in a first viewing area of a display; displaying a section of said multi-page document in a second viewing area of said display in legible form; emphasizing an area of said thumbnail image corresponding to said section displayed in said second viewing area; accepting user input controlling sliding said emphasized area through said multi-page document; and scrolling said displayed section in said second viewing area responsive to said sliding so that said emphasized area continues to correspond to said displayed section.
 12. The method of claim 11 further comprising the steps of: accepting user input indicating user-specific concepts of interest; analyzing said multi-page document to identify locations of discussion of said user-specific concepts of interest; marking said locations in both said thumbnail image and in said displayed section in said second viewing area.
 13. A computer program product for annotating an electronically stored document comprising: code for accepting user input indicating a user-specified concept of interest; code for analyzing said electronic document to identify locations of discussion of said user-specified concepts of interest; code for displaying said electronic document with visual indications of said identified locations; and a computer-readable storage medium for storing the codes.
 14. The product of claim 13 wherein said analyzing code comprises code for exploiting a probabilistic inference method to identify said locations.
 15. The product of claim 14 wherein said probabilistic inference method comprises a Bayesian belief network.
 16. The product of claim 15 further comprising code for: accepting user input defining a structure of said Bayesian belief network.
 17. The product of claim 16 further comprising code for modifying said Bayesian belief network in accordance with content of said electronic document.
 18. The product of claim 17 wherein said modifying code comprises code for updating said Bayesian belief network in accordance with proximity of keywords to said identified locations.
 19. The product of claim 13 wherein said displaying code comprises code for highlighting said locations.
 20. The product of claim 13 wherein said displaying code comprises code for highlighting sections of said document surrounding said locations.
 21. The product of claim 13 wherein said displaying code comprises code for displaying balloons pointing to said locations.
 22. The product of claim 13 wherein said displaying code comprises code for displaying marginal notations identifying said locations.
 23. The product of claim 15 further comprising: code for accepting user input indicating a degree of relation between said locations and said concepts of interest; and code for modifying said Bayesian belief network responsive to said degree of relation.
 24. The product of claim 1 further comprising code for displaying a level of relevance of said document to said concept of interest.
 25. A computer program product for displaying a multipage document comprising: code for displaying an elongated thumbnail image of a multi-page document in a first viewing area of a display; code for displaying a section of said multi-page document in a second viewing area of said display in legible form; code for emphasizing an area of said thumbnail image corresponding to said section displayed in said second viewing area; code for accepting user input controlling sliding of said emphasized area through said thumbnail image; code for scrolling said displayed section so that said displayed section continues to correspond to said emphasized area; and a computer-readable storage medium for storing the codes.
 26. The computer program product of claim 25 further comprising: code for accepting user input indicating user-specific concepts of interest; code for analyzing said multi-page document to identify locations of discussion of said user-specific concepts of interest; and code for marking said locations in both said thumbnail image and in said displayed section in said second viewing area.
 27. A computer system comprising: a processor; and a computer-readable storage medium storing code to be executed by said processor, said code comprising: code for accepting user input indicating user-specific concepts of interest; code for analyzing an electronic document to identify locations of discussion of said user-specific concepts of interest; and code for displaying said electronic document with visual indications of said identified locations.
 28. A computer program product for displaying a multipage document comprising: code for displaying an elongated thumbnail image of a multi-page document in a first viewing area of a display; code for displaying a section of said multi-page document in a second viewing area of said display in legible form; code for emphasizing an area of said thumbnail image corresponding to said section displayed in said second viewing area; code for accepting user input indicating user-specific concepts of interest; code for analyzing said multi-page document to identify locations of discussion of said user-specific concepts of interest; and code for marking said locations in both said thumbnail image and in said displayed section in said second viewing area.